2nd Edition

Graphics for Statistics and Data Analysis with R

By Kevin J. Keen Copyright 2018
    610 Pages
    by Chapman & Hall

    Praise for the First Edition

    "The main strength of this book is that it provides a unified framework of graphical tools for data analysis, especially for univariate and low-dimensional multivariate data. In addition, it is clearly written in plain language and the inclusion of R code is particularly useful to assist readers’ understanding of the graphical techniques discussed in the book. … It not only summarises graphical techniques, but it also serves as a practical reference for researchers and graduate students with an interest in data display." -Han Lin Shang, Journal of Applied Statistics

    Graphics for Statistics and Data Analysis with R, Second Edition, presents the basic principles of graphical design and applies these principles to engaging examples using the graphics and lattice packages in R. It offers a wide array of modern graphical displays for data visualization and representation. New in the second edition are coverage of the ggplot2 graphics package and material on human visualization and on color rendering in R, on screen, and in print.

    Features

    • Emphasizes the fundamentals of statistical graphics and best practice guidelines for producing and choosing among graphical displays in R
    • Presents technical details on topics such as the estimation of quantiles; nonparametric and parametric density estimation; diagnostic plots for the simple linear regression model; polynomial regression, splines, and locally weighted polynomial regression for producing a smooth curve; and Trellis graphics for multivariate data
    • Provides downloadable R code and data for figures at www.graphicsforstatistics.com

    Kevin J. Keen is a Professor of Mathematics and Statistics at the University of Northern British Columbia (Prince George, Canada) and an Accredited Professional Statistician™ by the Statistical Society of Canada and the American Statistical Association.

    List of Figures

    List of Tables

    Preface to the First Edition

    Preface to the Second Edition

    Acknowledgments

    I Introduction

    The Graphical Display of Information

    Introduction

    Learning Outcomes

    Know the Intended Audience

    Principles of Effective Statistical Graphs

    The Layout of a Graphical Display

    The Design of Graphical Displays

    Graphicacy

    The Grammar of Graphics

    Graphical Statistics

    Conclusion

    Exercises

    II A Single Discrete Variable

    Basic Charts for the Distribution of a Single Discrete Variable

    Introduction

    Learning Outcomes

    An Example from the United Nations

    The Dot Chart

    The Bar Chart

    Definition

    Pseudo Three-Dimensional Bar Chart

    The Pie Chart

    Definition

    Pseudo Three-Dimensional Pie Chart

    Recommendations Concerning the Pie Chart

    Conclusion

    Exercises

    Advanced Charts for the Distribution of a Single Discrete Variable

    Introduction

    Learning Outcomes

    The Stacked Bar Chart

    Definition

    The Stacked Bar Plot Versus the Bar Chart and the Pie Chart

    The Pictograph

    Definition

    The Pictograph Versus the Dot Chart and the Bar Chart

    Variations on the Dot and Bar Charts

    The Bar-Whisker Chart

    Dot-Whisker Chart

    Frames, Grid Lines, and Order

    Frame

    Grid Lines

    Order

    Conclusion

    Exercises

    III A Single Continuous Variable

    Exploratory Plots for the Distribution of a Single Continuous Variable

    Introduction

    Learning Outcomes

    The Dotplot

    Definition

    Variations on the Dotplot

    The Stemplot

    Definition

    The Boxplot

    Definition

    Variations on the Boxplot

    The EDF Plot

    Definition

    The EDF Plot as a Diagnostic Tool

    Conclusion

    Exercises

    Diagnostic Plots for the Distribution of a Continuous Variable

    Introduction

    Learning Outcomes

    The Quantile-Quantile Plot

    The Probability Plot

    Estimation of Quartiles and Percentiles∗

    Estimation of Quartiles

    Estimation of Percentiles

    Conclusion

    Exercises

    Nonparametric Density Estimation for a Single Continuous Variable

    Introduction

    Learning Outcomes

    The Histogram

    Definition

    A Circular Variation on the Histogram: The Rose Diagram

    Kernel Density Estimation∗

    Spline Density Estimation∗

    Choosing a Plot for a Continuous Variable∗

    Conclusion

    Exercises

    Parametric Density Estimation for a Single Continuous Variable

    Introduction

    Learning Outcomes

    Normal Density Estimation

    Transformations to Normality

    Pearson’s Curves∗

    Gram-Charlier Series Expansion∗

    Conclusion

    Exercises

    IV Two Variables

    Depicting the Distribution of Two Discrete Variables

    Introduction

    Learning Outcomes

    The Grouped Dot Chart

    The Grouped Dot-Whisker Chart

    The Two-Way Dot Chart

    The Multi-Valued Dot Chart

    The Side-by-Side Bar Chart

    The Side-by-Side Bar-Whisker Chart

    The Side-by-Side Stacked Bar Chart

    The Side-by-Side Pie Chart

    The Mosaic Chart

    Conclusion

    Exercises

    Depicting the Distribution of One Continuous Variable and One Discrete Variable

    Introduction

    Learning Outcomes

    The Side-by-Side Dotplot

    The Side-by-Side Boxplot

    The Notched Boxplot

    The Variable-Width Boxplot

    The Back-to-Back Stemplot

    The Side-by-Side Stemplot

    The Side-by-Side Dot-Whisker Plot

    The Trellis Kernel Density Estimate∗

    Conclusion

    Exercises

    Depicting the Distribution of Two Continuous Variables

    Introduction

    Learning Outcomes

    The Scatterplot

    The Sunflower Plot

    The Bagplot

    The Two-Dimensional Histogram

    Definition

    The Levelplot

    The Cloud Plot

    Two-Dimensional Kernel Density Estimation∗

    Definition

    The Contour Plot

    The Wireframe Plot

    Conclusion

    Exercises

    V Statistical Models for Two or More Variables

    Simple Linear Regression: Graphical Displays

    Introduction

    Learning Outcomes

    The Simple Linear Regression Model

    Definition

    The Scatterplot

    The Sunflower Plot

    Residual Analysis

    Definition

    Residual Scatterplots

    Depicting the Distribution of the Residuals

    Depicting the Distribution of the Semistandardized Residuals

    Influence Analysis

    Definition

    Matrix Notation for the Simple Linear Regression Model

    Depicting Standardized Residuals

    Depicting the Distribution of Studentized Residuals

    Depicting Leverage

    Depicting DFFITS

    Depicting DFBETAS

    Depicting Cook’s Distance

    Influence Plots

    Conclusion

    Exercises

    Polynomial Regression and Data Smoothing: Graphical Displays

    Introduction

    Learning Outcomes

    The Polynomial Regression Model

    Splines

    Locally Weighted Polynomial Regression

    Conclusion

    Exercises

    Visualizing Multivariate Data

    Introduction

    Learning Outcomes

    Depicting Distributions of Three or More Discrete Variables

    The Sinking of the Titanic

    Thermometer Chart

    Three-Dimensional Bar Chart

    Trellis Three-Dimensional Bar Chart

    Depicting Distributions of One Discrete Variable and Two or More Continuous Variables

    Anderson’s Iris Data

    The Superposed Scatterplot

    The Superposed Three-Dimensional Scatterplot

    The Scatterplot Matrix

    The Parallel Coordinates Plot

    The Trellis Plot

    Observations of Multiple Variables

    OECD Healthcare Service Data

    Chernoff’s Faces

    The Star Plot

    The Rose Plot

    The Multiple Linear Regression Model

    Definition

    Modeling Perch Mass

    Residual Scatterplot Matrix

    Leverage Scatterplot Matrix

    Influence Plot

    Partial-Regression Scatterplot Matrix

    Partial-Residual Scatterplot Matrix

    Summary of the Model for Perch Mass

    Conclusion

    Exercises

    VI Appendices

    Human Visualization

    Introduction

    Learning Outcomes

    Optics

    Introduction

    Geometrical Optics

    The Light Spectrum

    Anatomy of the Human Eye

    The Perception of Colour

    Graphical Perception

    Weber’s Law

    Stevens’s Law

    The Gestalt Laws of Organization

    Kosslyn’s Image Processing Model

    Conclusion

    Exercises

    Color Rendering

    Introduction

    Learning Outcomes

    RGB and XYZ Color Spaces

    HSL and HSV Color Spaces

    CIELAB and CIELUV Color Spaces

    HCL Color Space

    CMYK Color Space

    Displaying Color in R

    Saving Color Documents from R

    Conclusion

    Exercises

    Bibliography

    Index

    "A leading expert wrote the book. The book is an exposition of statistical methodology that focuses on ideas and concepts and makes extensive use of graphical presentation, but readers should have some prior experience of statistical methodology. The chapters also contain many exercises with solutions and hints presented in the Appendix. The R codes are available for download on the website. The book presents data and Programmes to replicate the models developed, offers new methods that are ready to use, and explores graphical statistics in its entirety from the fundamentals of modern methods. The book is also a complete reference manual and should be considered a must-have companion for the interested advanced audience."
    ~International Society for Clinical Biostatistics

    "… this is a book I can recommend for consideration in a course or as a course supplement. It is generally clear and well-written, and the statistical aspects of some of these methods are explained in sufficient detail to put these in context."
    ~Michael Friendly, Journal of Agricultural, Biological, and Environmental Statistics